Prediction of subcellular localizations using amino acid composition and order.

Authors

  • Y Fujiwara
  • M Asogawa
Abstract

Correct subcellular localization is essential for protein function. To predict subcellular localization, we developed a method, SortPred, that uses amino acid composition and order. Composition captures global features, e.g., the amino acid composition of full or partial sequences, while order captures local features, e.g., the order of amino acids in the sequence. The former was modeled by neural networks and the latter by a hidden Markov model. The method predicts signal peptides (SP), mitochondrial targeting peptides (mTP), chloroplast transit peptides (cTP), and nuclear or cytosolic sequences (other). Compared with previous methods, it achieved slightly higher prediction accuracy: 86% for plant and 91% for non-plant sequences. We analyzed the trained neural networks and hidden Markov models and found that they represent the biological features of the sequences well.
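
As an illustration of the "global" composition features mentioned in the abstract, the following is a minimal sketch, not SortPred's actual feature extraction. The function names, the 20-residue alphabet ordering, and the N-terminal window length of 20 are all assumptions made for this example; the abstract does not specify them.

```python
from collections import Counter

# The 20 standard amino acids, in an arbitrary fixed order (assumed here).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence: str) -> list[float]:
    """Return the fraction of each standard amino acid in `sequence`.

    Characters outside the 20 standard residues are ignored.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        # Empty sequence or no standard residues: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def global_features(sequence: str, n_terminal: int = 20) -> list[float]:
    """Concatenate full-sequence and N-terminal-window compositions.

    `n_terminal` is a hypothetical window length chosen for illustration.
    """
    return composition(sequence) + composition(sequence[:n_terminal])
```

Calling `global_features` on a sequence yields a 40-element vector (20 fractions for the full sequence, 20 for the N-terminal window) that could serve as input to a classifier such as a neural network. The order-dependent, "local" features would need a separate sequence model, such as the hidden Markov model the abstract describes.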

Similar resources

MultiLoc: prediction of protein subcellular localization using N-terminal targeting sequences, sequence motifs and amino acid composition

MOTIVATION Functional annotation of unknown proteins is a major goal in proteomics. A key annotation is the prediction of a protein's subcellular localization. Numerous prediction techniques have been developed, typically focusing on a single underlying biological aspect or predicting a subset of all possible localizations. An important step is taken towards emulating the protein sorting proces...

Using N-terminal targeting sequences, amino acid composition, and sequence motifs for predicting protein subcellular localizations

Functional annotation of unknown proteins is a major goal in proteomics. A key step in this annotation process is the definition of a protein’s subcellular localization. As a consequence, numerous prediction techniques for localization have been developed over the years. These methods typically focus on a single underlying biological aspect or predict a subset of all possible subcellular locali...

Improving Protein Localization Prediction Using Amino Acid Group Based Physichemical Encoding

Computational prediction of protein localization is one common way to characterize the functions of newly sequenced proteins. Sequence features such as amino acid (AA) composition have been widely used for subcellular localization prediction due to their simplicity while suffering from low coverage and low prediction accuracy. We present a physichemical encoding method that maps protein sequenc...

SubCellProt: Predicting Protein Subcellular Localization Using Machine Learning Approaches

High-throughput genome sequencing projects continue to churn out enormous amounts of raw sequence data. However, most of this raw sequence data is unannotated and, hence, not very useful. Among the various approaches to decipher the function of a protein, one is to determine its localization. Experimental approaches for proteome annotation including determination of a protein's subcellular loca...

A New Ensemble Scheme for Predicting Human Proteins Subcellular Locations

Predicting subcellular localizations of human proteins becomes crucial when new, unknown protein sequences do not have significant homology to proteins of known subcellular locations. In this paper, we present a novel approach to develop the CE-Hum-PLoc system. Individual classifiers are created by selecting a fixed learning algorithm from a pool of base learners and then trained by varying feature...

Journal:
  • Genome informatics. International Conference on Genome Informatics

Volume 12

Publication date: 2001